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^ Abstract 

^' 

O In this manuscript, we are concerned witii data generated from a diffusion ten- 

^ — I 
l; sor imaging (DTI) experiment. Tine goal is to parameterize manifold-like white matter 

rS tracts, such as the corpus callosum, using principal surfaces. We approach the prob- 

ed 

lem by finding a geometrically motivated surface-based representation of the corpus 

callosum and visualize the fractional anisotropy (FA) values projected onto the surface; 

the method applies to any other diffusion summary as well as to other white matter 

tracts. We provide an algorithm that 1) constructs the principal surface of a corpus 

callosum; 2) flattens the surface into a parametric 2D map; 3) projects associated FA 

values on the map. The algorithm was applied to a longitudinal study containing 466 
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diffusion tensor images of 176 multiple sclerosis (MS) patients observed at multiple 
visits. For each subject and visit the study contains a registered DTI scan of the cor- 
pus callosum at roughly 20,000 voxels. Extensive simulation studies demonstrate fast 
convergence and robust performance of the algorithm under a variety of challenging 
scenarios. 
Keywords: corpus callosum, principal curves and surfaces, thin plate splines 



1 Introduction 

This research is motivated by the need to establish a parametric description of the corpus 
callosum structure, a neuronal structure with a curved shape. The corpus callosum is a 
wide, flat and curved bundle of neural fibers beneath the cortex in the eutherian brain at 
the longitudinal fissure. It connects the left and right cerebral hemispheres and facilitates 
interhemispheric communication. As the largest white matter structure in the human brain, 
it consists of 200 to 250 million contralateral axonal projections. 

The left panel in Figure [1] displays that the corpus callosum appears as a two dimen- 
sional manifold in its principal structure. It is curved towards the inferior part of the brain 
on both the anterior and posterior sides. A different perspective is shown in the middle 
panel on the right. Though the corpus callosum lies in a three dimensional space, its l<ey 
structure is intuitively that of a "carpet" that lies in a two dimensional manifold. Therefore, 
dimension reduction techniques may provide strong data compression along with novel 
visualization and parametrization approaches that could be easy to use in practice. 

To achieve this, the first step is to estimate the center surface of the corpus callosum. 
The second step is to obtain the projection of each data point; these projections can then 
be used to map the diffusion properties of the corpus callosum onto the 2D manifold. 

The focus of our work is on principal surfaces for dimension reduction. Before we 
describe them, we outline other potential methods. Principal component analysis (PGA) 
is one of the most useful tools for dimension reduction. PGA finds directions (vectors) in 
the space that explain the largest variability of the data, while constraining directions of 
variation to be orthogonal. 



Though PGA is a widely used strategy for dimension reduction, it can lead to over sim- 
plification. As shown in the left panel in Figure|2| the underlying true curve that generates 
the data is part of a circle while the first principal component is a line. Direct applications 
of splines, wavelets and related regression methods cannot be done as they require the 
mean function of the data to be in a functional form; that is, there is only one y for every 
X. This is clearly violated in the corpus callosum example and many other white matter 



manifolds of interest. |Palus and Dvorak| [291 illustrated the pitfalls and precautions when 
applying linear RCA in non-linear settings. In our case, since the corpus callosum clearly 
is a nonlinear structure, PGA is not a viable candidate for dimension reduction. 

IVIany non-linear methods have been proposed for fitting non-linear data structures. 
As an example, [10] proposed a non-linear extension of PGA. The core idea is to include 
product combinations of the variables in the data matrix. Another useful tool - the self- 
organizing map (SOIVI) - was proposed in [201, (21]. SOIVIs are unsupervised learning 
procedures which are used to discover structure in the data. Other nonlinear method 
methods such as non-linear principal component analysis (NLPGA) ||23l, |[22l and principal 
geodesic analysis (PGA) [8], [7] can also fit non-linear data structures. 

An important concept in doing non-linear data compression is principal surfaces |T7], 
(TS). Principal surfaces are manifolds that pass through the middle of the data. The sur- 
faces are fit via nonparametric low-dimensional manifolds that minimize the orthogonal 
distance from the data to themselves. Principal surfaces satisfy a self consistency cond\- 
tion, in that they are the conditional expectation (local average) of the data. The middle 
panel of Figure [2] shows the difference between the principal curves (surfaces) and re- 
gression. It highlights that the principal curve minimizes the sum of orthogonal distances 



while a spline model fit tries to minimize the sum of distances parallel to the y axis. 

The principal surface is not unique. For example, it has been proven ^7\, HS] that 
all the principal components of a dataset are self-consistent. Other factors may also im- 
pact the results of fitting principal curves and surfaces. Using another example, different 
smoothness and lengths also lead to different principal curves. The original algorithm 
proposed in [18J used an iterated algorithm with local planar smoother to achieve the 
principal surfaces. Motivated by multivariate adaptive regression splines (MARS), adap- 
tive principal surface is proposed later in mU. |Dong and McAvoy| [H combines principal 



curves and neural networks. The authors construct a principal curve via a two-step neural 
network where the nonlinear "layer" is the construction of principal curve. |Einbecketal.||[6| 



proposed an algorithm for fitting local principal surfaces. The surface is connected via tri- 
angles and the main work is then to find adjacent vertices. [Goldsmith et al.[ [l2]| proposed 



a tube fitting algorithm based on principal curve method. Other algorithms have been pro- 
posed recently, such as fitting principal surfaces via kernel map manifold (KMM methods) 
[9] and constructing local principal curves and surfaces using subspace constraint mean 
shift (SCMS) I2Z1. 

Some local search methods though could fit the principal surface, may not have equally 
spaced parametrization on the surface. This will definitely lead to poor visualization of the 
flattened surface. In addition, some previous work can only achieve locally planar surfaces 
while for corpus callosum and other white matter tracts, a surface with certain degree of 
smoothness would be more reasonable. Therefore, we develop a method of obtaining a 
smoothed principal surface of the data with better parametrization on the surface. The 
rest of the manuscript is laid out as follows: In section 2, the principal surface concept 



will be introduced, and its corresponding algorithm will be shown. We will show how the 
principal surface algorithm works on simulated data in section 3. The algorithm then is 
applied to corpus callosum data and the FA maps are obtained in section 4. We will 
conclude the whole paper in section 5. 

2 Methods 

2.1 Principal Surfaces 

Let Xj = {xii, Xi2, Xjs)^, « = 1, . . . , / be the data points in three dimensional space, 7^^ fol- 
lowing an underlying distribution. U = {tii,ti2f be corresponding parametrization points 
in two dimensional space, TZ^. In addition, we will require (without loss of generality) that 
the 2D coordinate space be the unit square [0, 1] x [0, 1] c TZ'^. We define / as the smooth 
principal surface function / : t^ h- /(tj) that maps from TZ'^ to TZ^. The principal surface 
function satisfies the se/^cons/stency condition: 

E(X|A/(X) =t) = /(t) forallt, (1) 

where A/(x) = sup^ {t : ||x-/(t)|| = inf^ ||x-/(/i)||} is the projection function with respect 
to /. The projection function maps a data point on to the closest principal surface point 



having the largest parametrization. [Hastie and Stuetzle|p8] showed that principal curves 



and surfaces generally exist, though are not unique. We have found that there are two 
main distinctions between different fitted principal surfaces: the degree of smoothness 
and the method of parametrization. In most algorithms, these properties will be controlled 
by the specific smoother being used in the algorithm and its tuning parameters. The 



details of our specific algorithm will be demonstrated in the next section. 

2.2 Algorithm 

In this paper, the algorithm used in US) will be modified. This algorithm allows us to find 
the surface coordinate for each data point {ti,i = !,...,/), which we will use later to 
create parametric summaries. However, the original principal surface algorithm can only 
yield surfaces which are locally flattened. Therefore, instead of local planar smoothers, we 
employ thin-plate splines (TPS) for fitting the surface. Thin-plate splines were proposed by 
|5J and are now widely used for bivariate smoothing. The TPS penalize the least squares 
error by a high-order derivative term in order to achieve a desired degree of smoothness. 



Wood] [32] improved the computational efficiency when fitting TPS by using an optimal 
approximating basis that we employ. 

Preprocessing. Recall the notation we defined in section 2.2, let X = [xi, . . . ,x/]^ be 
the / X 3 matrix that contains the 3D coordinates of the dataset. We assumed that the 
data are centered around the origin. Principal component decomposition is then applied. 
Let X = USV^ be the singular value decomposition of X. Then US are the first two 
principal "scores" of the data matrix, where S is a submatrix of S containing the first two 
columns. We then standardized the scores so that both scores are in [0, 1]. and used 
them as the initial 2D parametrization. 

Conditional Expectation. Suppose U = (tn, ti2Y be the current parametrization of Xj 
in two dimensional space, [0, 1] x [0, 1]. For a specific data point xq, with its coordinate to, 
we choose r as a radius, such that all the other data points with their projection coordinate 



having less distance from to than r are considered as the neighbor of x, see the left 
panel in Figure |3| Let A/'xo be the set of all the neighbors of the point xq. Then we have 
A/'xo = {xj : \\tj - toll < r] and we calculate the local weighted average as follows: 

exp{-^} 



x'- = ^<x, Where < := l[x,eAr.j 



X 



Smoothing. Bivariate thin plate splines |311 are applied after we obtained these local 
averages: x^™, i = 1,.. . ,1. In this step, we fitted 



^id — fdlitil) + fd2(ti2) + fdsitily U2) + Cid, C? — 1, 2, 3, 



(2) 



and obtained a bivariate thin plate spline smoothing of the local average points, which is 
demonstrated in the right panel of Figure [3} Here 



f{U) 



fllitil) + fl2\ti2) + fl3{til,ti2) 
f2l{til) + f22{ti2) + f2-i\til,ti2) 
fsiitii) + f32(ti2) + f33itil,ti2) 



(3) 



is the current principal surface function mapping from the 2D parametrization space to 3D 
coordinate space. The thin plate splines could be fitted using "mgcv" pacl^age in software 
R. 

Projection. We then projected each data point onto the current principal surface and 
obtained new 2D parametrization. we used a grid search method to find the projection 
and only search within the range [0, 1] x [0, 1]. Therefore there will be some data points 
being projected onto the boundary, which brings some issues when we further analyzed 
the 2D parametrization. After this step, we iterated the whole procedure with the new 2D 



parametrization. 



The whole algorithm is illustrated as follows: 
Input: Data in 3D coordinate, X/xs 

Output: Principal surface function, fiz^-^TzA, and the 2D parametrization of all data 

points, T7X2 

Initialization: De-mean X for each column; 

set the initial 2D parametrization as T(°) = US; 

Set err= 1, i= 1; 

while (i <max.iterand err>thres) do 



(1). X'- ^- WX, where w,j = l[,wg^ 



.o)J^^- 



(2). Fit X'™ = /(T) + e; 

(3).t(— )^ar^ mint ||x,-/(t)f; 

(4). err < — HT'''^ - T("^^)||2, ii — i + 1; 

end 

Algorithm 1 : Principal surface algorithm 

3 Simulation Results 

3.1 Simulation Setup 

We illustrate four examples to investigate the algorithm in highly idealized test settings. 
The number of data points was set to / = 6, 000 in all simulation settings and we sampled 
a subset of 1, 000 points in each case. The data were centered around (0, 0, 0), which was 
assumed l^nown. 

Case 1 The data points in the first simulation case are uniformly distributed around a 
cylinder with an open seam. Set Oi ~ U(0,2n - 0.5), e^ ~ N{0,0.15^) and Zi ~ f/(-3,3) 



i.i.d., then let 



cos6'i(l + ei) 



sin6'j(l + ei) 



Case 2 The second case is based on the Himmelblau's function [19], which is f{x, y) = 
{x'^ + y- 11)2 _^ (2; + 2/2 _ 7)2. Let Zii ~ f/(-5, 5), z,2 ~ U{-5, 5), e, ~ A^(0, 50^) i.i.d., then 
let 

-Tm{(4 + ^^2 - 11)' + {za + 4 - 7)' + q} 



Case 3 In the third case the data points form a flatten surface at the beginning and 
then begin to bent over towards the bottom. Set zn ~ f/(0, 2), Zi^ = when i = 1,. . . , 1/2; 

Zii = cos(6'i)+2, Zis = -l+sm(6'i)forii = //2+1, . . . , J, where 6'i ~ f/(-7r/2,7r/2). Let ^^2 ~ 
f/(0, 10) and ti ~ f/(-0.4, 0.4). All the random numbers are generated independently. 

Then we let 

Zii 

Zi2 
Zi3 + Cj 



Case 4 In the last case, we produce a two dimensional stretched digit "5". Let zn ~ 

U{0,1), Zi3 = when i = l...,3//10; za = 0, Zis ~ U{-1,0) when i = 3//10 + 
1, . . . , 4.5//10; Zii ~ f/(0, 0.5), Zis = -1 when i = 4.5//10 + 1, . . . , 6J/10; za - ^ , ^ 



^ + Uos(9i), 



Zi3 



-| + 1 sin(0i) for i = 6//10 + 1, . . . , 8.5//10 where 9i ~ f/(-7r/2, n/2); za ~ t/(0, 0.5), 



10 



za = -2 when i = 8.5J/10 + !,...,/. Let Zi2 ~ f/(0, 5) and e, ~ t/(-0.15, 0.15) when 
z = 1, ...,/. All the random numbers are generated independently. Then we let 



~ 


Zii + ei 


Zi2 


za + ei 



3.2 Simulation Results 

The results of the principal surface fitting algorithm for four examples are shown in Figure 
|4} Starting in the upper left panel in Figure |4l one can see that the algorithm reconstructs 
the cylinder very well. Since we did not force any constraints in fitting the surface, we 
do not achieve a closed seam cylinder. (Finding principal surfaces for closed cylinders 
or spheres remains an interesting topic for future research.) Other upper panels illustrate 
the fitting results of the Himmelblau's function, a non-function shaped "carpet" data cloud 
and a simulated digit "5" data cloud. The four panels in the bottom panels in Figure |4] 
compare the fitted surfaces with the original surfaces that generate the data cloud. The 
result are satisfactory for all cases. Given that the desired corpus callosum model fit is far 
simpler and smoother than these examples, the simulations show evidence of the viability 
of the principal surface algorithm as a robust method for this problem. 

For all four scenarios, the algorithm converged in less than 20 steps and took under 
two minutes on a 13-2. 3GHz PC machine with 4Gb RAIVI memory. To further investigate 
practical computing speed, we set varied sample sizes and number of grid points [Ngrid) 
for the third case above and measured the iterations and compute time. The results 
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are illustrated in Table [T] As the number of grid points or the number of data points 
increases, the computing time for each iteration step increases correspondingly, though 
did not go out the range of reasonable compute times. Note that, one would rarely use the 
entire dataset in large sample size cases. Our recommended strategy is to take random 
subsamples from the dataset and subsequently fit the surface using the subsample, if 
necessary, repeating the sub-sampling to investigate its impact on model fits. In addition, 
as the grid number increases, the number of steps to attain convergence decreases. 

4 Application 

4.1 Fitting tlie Principal Surface of a Corpus Callosum 

The data contain 466 scans generated from a diffusion tensor imaging (DTI) experi- 
ment performed on 176 IVIS patients. For each scan, we have corresponding fractional 
anisotropy (FA) value for each voxel of the entire corpus callosum area extracted via 
tractography. IVIore detailed data description are provided in 1301 and [28]. Fractional 
anisotropy has been associated cross sectionally and longitudinally with multiple scle- 
rosis diagnoses and symptoms OJ], im, [Tel, [30J, [35J. Ignoring the FA value for the 
moment, we first discuss the principal surface fit. 

For each scan, there are roughly 20,000 data points in the corpus callosum region 
of interest. For computational simplicity, we randomly choose 1 ,000 of them to build the 
surface. We illustrate the result of four scans as well as the overall average fitted surface in 
the left panels in Figure |5} The fitted surface for each scan is reasonable and is indicative 
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of the fits from the other scans, each inspected visually. The overall average is smoother 
than each scan as we expected. 

4.2 Flattened FA Representation 

A primary goal from the principal surface fitting is to flatten the surface and to make a 2D 
FA image. That is, our goal is to use DTI-based morphometric information to create 2D 
images of IVIR contrast properties, such as FA maps. First, we project each data point 
onto the surface by a grid search method. Then we smooth the associated FA value on 
the surface by calculating a local average. Interpolating this smooth onto a grid yields 
a 100 X 100 2D FA image displayed in the right panels of Figure [5| That is, right panels 
represent the flattened surfaces of the corresponding left panels. The bottom panel shows 
the average FA map, which shows an overall visualization of 2D FA maps for all the scans. 
Another important fact is from left panels in Figure |5] and right panel in Figure [T| we can 
see that important information of the FA values is retained, which is to say, it may be 
sufficient to visualize the 2D FA values instead of the original 3D FA values. 

5 Discussion 

In this manuscript we introduced a principal surface algorithm and used it to fit the corpus 
callosum. The goal of this work is largely developmental, creating a handy tool for dimen- 
sion reduction in DTI analysis of a primary structure. The result of both the simulations 
and the application is encouraging. While applied to the corpus callosum, the algorithm 
could be applied to any other three dimensional manifold-like objects where a two di- 
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mensional surface could be embedded and is of interest, for example, all the major white 
matter tracts in the brain [1]. The two dimensional manifold characterizes the original data 
and accomplishes both dimension reduction and better visualization. The surface that we 
constructed is smooth and could be easily projected onto to represent other properties of 
the original structure, such as the FA, mean diffusivity, parallel diffusivity, local thickness 
and so on. The algorithm is computationally feasible and scales well to larger images and 
densely measured structures. 

The role of the two dimensional sub-representation needs to be further explored. For 
future work, we are developing functional data analysis tools OJ], (131, GF]- [34]- [33] for 
relating the dimension reduced 2D manifold to outcomes of interest for the purpose of 
inference, biomarker creation and prediction. Of note, we are particularly interested in 
whether or not the 2D representation of the corpus callosum is less sensitive to issues of 
whole brain registration often used in the processing pipeline. In fact, it is possible that 
registering the 2D representation is preferable to whole brain registration a priori in certain 
applications. We note also that IVIFPCA [3] and LFPCA [35] methods have been shown to 
isolate registration error as a part of the model [25], thus raising the intriguing possibility 
of DTI processing streams that dramatically decrease the need and importance of whole 
brain template-based registration. 

Furthermore, current works suggest sagittal mid-line callosum average thickness might 
be of very meaningful in the study of some diseases ||26l, [SI]- In contrast. The princi- 
pal surface approach allows us to calculate the local thickness over the main body of the 
structure. We then could construct a global thickness map for the whole corpus callosum, 
following a procedure similar to that of constructing the FA map. Based on this, the thick- 
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ness changes of white matter could be measured on the whole corpus callosum instead 
of the mid-sagittal slice. 

Finally, we note that the procedure that we proposed can be further extended as a gen- 
eral problem of fitting sl<eleton manifolds. First, we are interested in fitting surfaces with 
fixed boundaries. Worl< has been done to analyze principal curves with fixed starting and 
ending points [2]. The extension to surfaces would be challenging. Suppose t = (ti,t2) 
is the corresponding coordinate on the surface of the original data point, x = (a;i,x2,a;3), 
which lies in 3D space. We restrict the range of t to be [0,1] x [0,1]. Now consider a 
predetermined function for x(t) = (a;i(t),a;2(t),a;3(t)) when the boundary values of t are 
linearly constrained. Such constraints would yield cylindrical fits easily though extensions 
to completely closed surfaces would require more elaborate constraints. 
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Figure 1: Left panel: 3D-rendering of corpus callosum. Right panel: horizontal, sagittal, and coronal slices. 
Views: R=Right, L=Left, S=Superior, l=lnterior, A=Anterior, P=Posterior. Produced in SDSIicer (4.1.1) 
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Figure 2: Left panel is a illustration of different dimension reduction methods. The blue points are the 
original data points, the dot-dashed green line is the first principal component, the dashed black curve is 
the spline fitting and the solid red curve is one of the principal curves. Middle panel shows the difference 
between the principal curve and the regression method: top panel shows that the principal curve minimizes 
the orthogonal distance, bottom panel shows that the spline regression minimize the distance in y axis. 
Both panels in the middle use the same dataset. Right panel illustrates several different principal curves for 
one dataset. 
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Figure 3: Left panel illustrates the projection step and the conditional expectation step, right panel illus- 
trates the smoothing step 
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Figure 4: Simulation results. In the four top panels, the red-green dots are the original data points and the 
pink yellow ones are the fitted principal surfaces. The top panels from left to right show the results of (1 ) an 
cylinder with an open seam on one side, (2) the Himmelblau's function, (3) a non-function shaped "carpet" 
data cloud and (4) a simulated digit "5" data cloud. The four bottom panels compare the fitted surfaces 
with the original surfaces that generate the data cloud. The blue dots are the original surfaces and the pink 
yellow ones are the fitted principal surfaces. 
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Figure 5: In the left column, top four panels illustrates the the original corpus callosum from four different 
scans with their fitted principal surfaces. The scattered points are the original data points and the surface 
across the middle of them is the fitted principal surface. The bottom panel shows the average fitted principal 
surface for all the 466 scans. Red color suggests higher FA value while green means lower one. The right 
panels are the corresponding flattened surfaces. In the 2-D image, red color represents higher FA while 
green suggests lower one. 
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B Tables 



Sample 
Size 


Number of grid points, denotec 


i as Ngrid 




15 30 50 100 


15 30 50 100 


15 30 50 


100 


Total time usage (Sec.) 


Number of iterations 


Time per iteratior 


1 (Sec.) 


100 


4 4 4 8 


6 5 3 3 


1 1 1 


3 


200 


7 5 5 10 


11 6 4 3 


1 1 1 


3 


500 


18 12 11 20 


17 8 5 4 


1 1 2 


5 


700 


27 21 18 32 


20 12 7 5 


1 2 3 


6 


1000 


39 40 35 49 


22 15 10 6 


2 3 3 


8 



Table 1: Simulation computing time consuming comparison 
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